Title of Dissertation / Thesis: COMPUTATIONAL ANALYSES OF MICROBIAL GENOMES – OPERONS, PROTEIN FAMILIES AND LATERAL GENE TRANSFER
Author: Yongpan Yan
Abstract
Yongpan Yan, Doctor of Philosophy, 2005. Dissertation / Thesis Directed By: Professor John Moult, Center for Advanced Biotechnology Research, University of Maryland Biotechnology Institute.
As a result of recent successes in genome-scale studies, especially genome sequencing, large amounts of new biological data are now available. This naturally challenges the computational world to develop more powerful and precise analysis tools. In this work, three computational studies have been conducted, utilizing complete microbial genome sequences: the detection of operons, the composition of protein families, and the detection of lateral gene transfer events.
In the first study, two computational methods, termed the Gene Neighbor Method (GNM) and the Gene Gap Method (GGM), were developed for the detection of operons in microbial genomes. GNM utilizes the relatively high conservation of gene order in operons, compared with genes in general. GGM makes use of the relatively short gap between genes in operons compared with that otherwise found between adjacent genes. The two methods were benchmarked using biological pathway data and documented operon data. Operons were predicted for 42 microbial genomes. The predictions are used to infer possible functions for some hypothetical genes in prokaryotic genomes and have proven a useful adjunct to structure information in deriving protein function in our structural genomics project.
In the second study, we have developed an automated clustering procedure to classify protein sequences in a set of microbial genomes into protein families. Benchmarking shows the clustering method is sensitive at detecting remote family members, and has a low level of false positives. The aim of constructing this comprehensive protein family set is to address several questions key to structural genomics.
First, our study indicates that approximately 20% of known families with three or more members currently have a representative structure. Second, the number of apparent protein families will be considerably larger than previously thought: we estimate that, by the criteria of this work, there will be about 250,000 protein families when 1000 microbial genomes are sequenced. However, the vast majority of these families will be small. Third, it will be possible to obtain structural templates for 70–80% of protein domains with an achievable number of representative structures, by systematically sampling the larger families.
The third study is the detection of lateral gene transfer events in microbial genomes. Two new high-throughput methods have been developed and applied to a set of 66 fully sequenced genomes. Both make use of a protein family framework. In the High Apparent Gene Loss (HAGL) method, the number and nature of gene loss events implied by classical evolutionary descent is analyzed. The higher the number of apparent losses, and the smaller the evolutionary distance over which they must have occurred, the more likely that one or more genes have been transferred into the family. The Evolutionary Rate Anomaly (ERA) method associates transfer events with proteins that appear to have an anomalously low rate of sequence change compared with the rest of that protein family. The methods are complementary in that the HAGL method works best with small families and the ERA method best with larger ones. The methods have been parameterized against each other, such that they have high specificity (less than 10% false positives) and can detect about half of the test events. Application to the full set of genomes shows widely varying amounts of lateral gene transfer.
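The idea behind the Gene Gap Method described in the abstract (genes in an operon tend to be separated by shorter gaps than other adjacent genes) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Gene` record, the function name `predict_operons_by_gap`, the 50 bp default for `max_gap`, and the requirement that grouped genes lie on the same strand are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; field names are illustrative only."""
    name: str
    start: int   # 1-based inclusive start coordinate
    end: int     # 1-based inclusive end coordinate
    strand: str  # "+" or "-"


def predict_operons_by_gap(genes: List[Gene], max_gap: int = 50) -> List[List[str]]:
    """Group adjacent same-strand genes whose intergenic gap is <= max_gap.

    The threshold and the same-strand rule are assumptions, not values
    taken from the thesis.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    groups: List[List[str]] = []
    current: List[Gene] = []
    for gene in ordered:
        if current:
            prev = current[-1]
            gap = gene.start - prev.end - 1  # negative when genes overlap
            if gene.strand == prev.strand and gap <= max_gap:
                current.append(gene)
                continue
            # Gap too long or strand changed: close the current run.
            groups.append([g.name for g in current])
        current = [gene]
    if current:
        groups.append([g.name for g in current])
    # Keep only runs of two or more genes as candidate operons.
    return [g for g in groups if len(g) >= 2]


example = [
    Gene("a", 1, 100, "+"),
    Gene("b", 120, 300, "+"),
    Gene("c", 900, 1200, "+"),
    Gene("d", 1210, 1500, "-"),
    Gene("e", 1505, 1800, "-"),
]
print(predict_operons_by_gap(example))
```

A fixed cutoff is the simplest possible choice here; a real predictor would more plausibly calibrate the gap criterion against documented operons, as the benchmarking mentioned in the abstract suggests.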
Related resources
Genetic transfer in Staphylococcus: a case study of 13 genomes
The widespread presence of antibiotic resistance and virulence among Staphylococcus isolates has been attributed to lateral genetic transfer (LGT) between different strains or species. However, there has been very little study of the extent of LGT in Staphylococcus species using a phylogenetic approach, particularly of the units of such genetic transfer. Here we report the first systematic stud...
Three 2-oxoacid dehydrogenase operons in Haloferax volcanii: expression, deletion mutants and evolution.
Two unrelated protein families catalyse the oxidative decarboxylation of 2-oxoacids, i.e. the 2-oxoacid dehydrogenase complexes (OADHCs) and the 2-oxoacid ferredoxin oxidoreductases (OAFORs). In halophilic archaea, OAFORs were found to be responsible for decarboxylation of pyruvate and 2-oxoglutarate. Nevertheless, two gene clusters encoding OADHCs were found previously in Haloferax volcanii, b...
Automating the Search for Lateral Gene Transfer
Most genes have attained their observed distribution among genomes by transmission from parent to offspring through time. In prokaryotes (bacteria and archaea), however, some genes are where they are as the result of transfer from an unrelated lineage. To elucidate the biological origins and functional consequences of lateral gene transfer (LGT), we have constructed an automated computational p...
Evolutionary Origins of Genomic Repertoires in Bacteria
Explaining the diversity of gene repertoires has been a major problem in modern evolutionary biology. In eukaryotes, this diversity is believed to result mainly from gene duplication and loss, but in prokaryotes, lateral gene transfer (LGT) can also contribute substantially to genome contents. To determine the histories of gene inventories, we conducted an exhaustive analysis of gene phylogenie...
Independent and parallel lateral transfer of DNA transposons in tetrapod genomes.
In animals, the mode of transmission of transposable elements is generally vertical. However, recent studies have suggested that lateral transfer has occurred repeatedly in several distantly related tetrapod lineages, including mammals. Using transposons extracted from the genome of the lizard Anolis carolinensis as probes, we identified four novel families of hAT transposons that share extreme...
Publication date: 2005